Natural language recognition using distributed processing

ABSTRACT

A method and system for computer-based recognition of natural language data. The method is implemented on a distributed computer network and includes obtaining natural language data, such as digital ink handwriting, using an input device 415, receiving the natural language data on a server 430 via a network, processing the natural language data using a recognizer 440 residing on the server 430 to produce intermediate format data 445, transmitting the intermediate format data 445 to an application 450, and decoding the intermediate format data 445 into computer-readable format data using the application 450 and context information associated with the application 450.

TECHNICAL FIELD

The present invention relates to a method of and system for natural language recognition, and in particular, to a method of and system for computer-based recognition of natural language data implemented on a distributed computer network.

CO-PENDING APPLICATIONS

Various methods, systems and apparatus relating to the present invention are disclosed in the following co-filed U.S. application, the disclosures of which are incorporated herein by cross-reference: NPW012U.S.

CROSS REFERENCES

Various methods, systems and apparatus relating to the present inventionare disclosed in the following granted U.S. patents and co-pending U.S.applications filed by the applicant or assignee of the presentapplication: The disclosures of all of these granted U.S. patents andco-pending U.S. applications are incorporated herein by reference.10/409,876 10/409,848 10/409,845 09/575,197 09/575,195 09/575,15909/575,132 09/575,123 09/575,148 09/575,130 09/575,165 09/575,15309/693,415 09/575,118 09/609,139 09/608,970 09/575,116 09/575,14409/575,139 09/575,186 09/575,185 09/609,039 09/663,579 09/663,59909/607,852 09/575,191 09/693,219 09/575,145 09/607,656 09/693,28009/609/132 09/693,515 09/663,701 09/575,192 09/663,640 09/609,30309/610,095 09/609,596 09/693,705 09/693,647 09/721,895 09/721,89409/607,843 09/693,690 09/607,605 09/608,178 09/609,553 09/609,23309/609,149 09/608,022 09/575,181 09/722,174 09/721,896 10/291,52210/291,517 10/291,523 10/291,471 10/291,470 10/291,819 10/291,48110/291,509 10/291,825 10/291,519 10/291,575 10/291,557 10/291,66110/291,558 10/291,587 10/291,818 10/291,576 10/291,589 10/291,5266,644,545 6,609,653 6,651,879 10/291,555 10/291,510 19/291,59210/291,542 10/291,820 10/291,516 10/291,363 10/291,487 10/291,52010/291,521 10/291,556 10/291,821 10/291,525 10/291,586 10/291,82210/291,524 10/291,553 10/291,511 10/291,585 10/291,374 10/685,52310/685,583 10/685,455 10/685,584 10/757,600 09/575,193 09/575,15609/609,232 09/607,844 09/607,657 09/693,593 10/743,671 09/928,05509/927,684 09/928,108 09/927,685 09/927,809 09/575,183 09/575,16009/575,150 09/575,169 6,644,642 6,502,614 6,622,999 09/575,14910/322,450 6,549,935 NPN004US 09/575,187 09/575,155 6,591,884 6,439,70609/575,196 09/575,198 09/722,148 09/722,146 09/721,861 6,290,3496,428,155 09/575,146 09/608,920 09/721,892 09/722,171 09/721,85809/722,142 10/171,987 10/202,021 10/291,724 10/291,512 10/291,55410/659,027 10/659,026 09/693,301 09/575,174 09/575,163 09/693,21609/693,341 09/693,473 09/722,087 
09/722,141 09/722,175 09/722,14709/575,168 09/722,172 09/693,514 09/721,893 09/722,088 10/291,57810/291,823 10/291,560 10/291,366 10/291,503 10/291,469 10/274,81709/575,154 09/575,129 09/575,124 09/575,188 09/721,862 10/120,44110/291,577 10/291,718 10/291,719 10/291,543 10/291,494 10/292,60810/291,715 10/291,559 10/291,660 10/409,864 10/309,358 10/410,48410/683,151 10/683,040 09/575,189 09/575,162 09/575,172 09/575,17009/575,171 09/575,161 10/291,716 10/291,547 10/291,538 10/291,71710/291,827 10/291,548 10/291,714 10/291,544 10/291,541 10/291,58410/291,579 10/291,824 10/291,713 10/291,545 10/291,546 09/693,38809/693,704 09/693,510 09/693,336 09/693,335 10/181,496 10/274,11910/309,185 10/309,066 10/778,090 10/778,056 10/778,058 10/778,06010/778,059 10/778,063 10/778,062 10/778,061 10/778,057 10/782,89410/782,895 10/786,631 10/793,933 10/804,034 10/815,621 10/815,61210/815,630 HYC004US 10/815,638 10/815,640 10/815,642 HYC008US 10/815,64410/815,618 10/815,639 HYD001US 10/815,647 10/815,634 10/815,63210/815,631 10/815,648 10/815,614 10/815,645 10/815,646 HYG009US10/815,620 10/815,639 HYG012US 10/815,633 10/815,619 HYG015US 10/815,61410/815,636 10/815,649 10/815,609 10/815,627 10/815,626 HYT004US10/815,611 10/815,623 10/815,622 HYT008US 10/815,625 10/815,62410/815,628 10/831,232 10/831,242 NPS059US NPA141US NPT039US NPT025USNPP043US NPA150US NPT024US NPP040US NPT040US NPT041US NPT042US NPT043USNPT044US NPK007US NPK006USThe disclosures of all of these granted U.S. patents and co-pending U.S.applications are incorporated herein by reference. Some patentapplications are temporarily identified by their docket number. Thiswill be replaced by the corresponding application number when available.

BACKGROUND ART

Recent advances in pattern classification have enabled the development of sophisticated software systems that can recognize natural language data (i.e. natural language user input) such as speech (see for example L. Rabiner and B. Juang, “Fundamentals of Speech Recognition”, Prentice Hall, Englewood Cliffs, N.J., 1993) or handwriting (see for example G. Lorette, “Handwriting Recognition or Reading? Situation At The Dawn of the 3rd Millennium”, Advances In Handwriting Recognition, Series in Machine Perception and Artificial Intelligence, Vol. 34, pp. 3-15, World Scientific Publishing Co. 1999).

These applications allow users to communicate with a computerised system in a natural and convenient way, and permit the automation of tasks that previously required human input. Some examples of such applications include interactive voice response (IVR) systems, automated cheque-processing systems and automated form data-entry systems. In addition, the growth of networked computing and the Internet has enabled the development of complex distributed systems, and the existence of open, standardized protocols has allowed the integration of end-user devices, centralized servers, and applications. An example of a three-tiered distributed system architecture is depicted in FIG. 1 (prior art), illustrating a system 100 which includes a client layer 110, network layer 120 and application layer 130. Client device 140 communicates with one or more servers 150 which in turn communicate with one or more applications 160. The combination of distributed computing and pattern recognition techniques has made possible the development of systems such as Netpage™ by Silverbrook Research Pty Ltd, an interactive paper-based interface to online information. Systems such as this give users the ability to interact with information from any location that provides network connectivity (including wireless network access) using familiar human-communication techniques such as handwriting or speech.

The basic processing steps of presently known pattern recognition systems are depicted in FIG. 2 (prior art). Processing begins when an input device 210 generates a signal 220 that is to be recognized by the system 100 (that is, to be classified as belonging to a specific class or sequence of class elements). Usually, one or more pre-processing procedures 230 are applied to remove noise and produce a normalized signal 240, which is then segmented 250 to produce a stream of primitive elements 260 required for a classification procedure 270. Note that often this segmentation 250 is “soft”, meaning that a number of potential segmentation points are located, and the final segmentation points are resolved during classification 270 or context processing 290.

The segmented signal 260 is then passed to a classifier 270 where a representative set of features is extracted from the signal and used in combination with a pre-defined model 275 of the input signal to produce a set of symbol hypotheses 280. These hypotheses 280 give an indication of the probability that a sequence of segments within the signal represents a basic symbolic element (e.g. letter, word, phoneme, etc.). After classification 270, the context-processing module 290 uses the symbol hypotheses 280 generated by the classifier 270 to decode the signal according to a specified context model 295 (such as a dictionary or character grammar). The result 297 produced by the context processing 290 is passed to the application 299 for interpretation and further processing.
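The stages of FIG. 2 can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the fixed-size (hard) segmentation, the single-number feature and the two-symbol toy model are all assumptions made for the example and are not taken from any particular recognizer.

```python
# Toy sketch of the FIG. 2 stages; every name and model value here is hypothetical.

def preprocess(signal):
    """Pre-processing 230: subtract the mean as a stand-in for normalization."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def segment(signal, size):
    """Segmentation 250: cut the signal into fixed-size primitive elements."""
    return [signal[i:i + size] for i in range(0, len(signal), size)]


def classify(seg, model):
    """Classification 270: score every model symbol against one segment.

    The feature is simply the segment mean; scores are normalized to sum to 1.
    """
    feature = sum(seg) / len(seg)
    scores = {sym: 1.0 / (1.0 + abs(feature - proto)) for sym, proto in model.items()}
    total = sum(scores.values())
    return {sym: score / total for sym, score in scores.items()}


def decode(hypotheses, lexicon):
    """Context processing 290: pick the lexicon word with the highest joint score."""
    best_word, best_score = None, 0.0
    for word in lexicon:
        if len(word) != len(hypotheses):
            continue
        score = 1.0
        for symbol, hyp in zip(word, hypotheses):
            score *= hyp.get(symbol, 0.0)
        if score > best_score:
            best_word, best_score = word, score
    return best_word


def recognize(signal, model, lexicon, size=2):
    """Run the four stages in order and return the decoded result."""
    normalized = preprocess(signal)
    hypotheses = [classify(seg, model) for seg in segment(normalized, size)]
    return decode(hypotheses, lexicon)


MODEL = {"a": -1.0, "b": 1.0}   # stand-in for the pre-defined model 275
LEXICON = ["ab", "ba", "aa"]    # stand-in for the context model 295
result = recognize([0.0, 0.0, 2.0, 2.0], MODEL, LEXICON)
```

With the toy values above, the two segments score highest for "a" and "b" respectively, so the lexicon word "ab" is selected. A practical system would use soft segmentation and far richer features; the sketch only shows how the stages hand data to one another.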

Natural language input is inconsistent, noisy, and ambiguous, leading to potential recognition and decoding errors. However, high recognition accuracy is required for pattern recognition applications to operate successfully, since mistakes can be expensive and frustrating to users. As a result, recognition systems should make use of as much contextual information as possible to increase the possibility of correctly recognizing the natural language input. For example, when recognizing a signal that must represent a country name, the recognition system can use a pre-defined list of valid country names to guide the recognition procedure. Similarly, when recognizing a phone number, a limited symbol set (i.e. digits) can be used to constrain the recognition results. The problem domain for many pattern recognition systems is inherently ambiguous (i.e. many of the input patterns encountered during processing cannot be accurately classified without further information from a different source).
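The phone-number case can be illustrated with a short Python sketch. The per-position hypothesis dictionaries, their scores and the helper names `constrain` and `best_string` are hypothetical and chosen only to show how a limited symbol set can change the outcome.

```python
DIGITS = set("0123456789")


def constrain(hypotheses, allowed):
    """Drop symbols outside the allowed set and renormalize the remaining scores."""
    constrained = []
    for position in hypotheses:
        kept = {sym: p for sym, p in position.items() if sym in allowed}
        if not kept:
            raise ValueError("no hypothesis is permitted by the context")
        total = sum(kept.values())
        constrained.append({sym: p / total for sym, p in kept.items()})
    return constrained


def best_string(hypotheses):
    """Take the top-scoring symbol at each position."""
    return "".join(max(position, key=position.get) for position in hypotheses)


# Invented classifier output: 'l' vs '1' and 'O' vs '0' are easily confused.
raw = [{"l": 0.6, "1": 0.4}, {"O": 0.55, "0": 0.45}]

unconstrained = best_string(raw)                     # letters win on raw score
as_phone_digits = best_string(constrain(raw, DIGITS))  # digits-only context
```

Without context the top-scoring symbols spell "lO"; restricting the symbol set to digits yields "10" from the same hypotheses.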

The following discussion refers to handwriting by way of background information; however, the present invention should not be considered to be limited to handwriting as the only form of natural language data input.

Digital ink is a digital representation of the information generated by a pen-based input device. Generally, digital ink is structured as a sequence of strokes, each of which begins when the pen-based input device makes contact with a drawing surface and ends when the pen-based input device is lifted. Each stroke comprises a set of sampled coordinates that define the movement of the pen-based input device whilst it is in contact with the drawing surface.
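One possible in-memory representation of this structure is sketched below in Python. The class names, the `(kind, x, y)` event format and the helper `ink_from_events` are assumptions for illustration; real pen devices typically also report timing and other attributes that are omitted here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Stroke:
    """Sampled (x, y) coordinates recorded between pen-down and pen-up."""
    points: List[Tuple[float, float]] = field(default_factory=list)


@dataclass
class DigitalInk:
    """A sequence of strokes in the order they were written."""
    strokes: List[Stroke] = field(default_factory=list)


def ink_from_events(events):
    """Build digital ink from hypothetical (kind, x, y) pen events.

    kind is "down" (pen touches the surface), "move" or "up" (pen lifted).
    Samples that arrive while the pen is lifted are ignored.
    """
    ink = DigitalInk()
    current = None
    for kind, x, y in events:
        if kind == "down":
            current = Stroke([(x, y)])
        elif kind == "move" and current is not None:
            current.points.append((x, y))
        elif kind == "up" and current is not None:
            current.points.append((x, y))
            ink.strokes.append(current)
            current = None
    return ink


ink = ink_from_events([
    ("down", 0.0, 0.0), ("move", 1.0, 1.0), ("up", 2.0, 2.0),
    ("move", 5.0, 5.0),                      # pen is in the air: not recorded
    ("down", 3.0, 0.0), ("up", 3.0, 1.0),
])
```

The example produces two strokes, the first with three sampled points and the second with two.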

As an example, one of the major issues faced in the development of highly accurate handwriting recognition systems is the inherent ambiguity of handwriting (e.g. the letters ‘u’ and ‘v’, ‘t’ and ‘f’, and ‘g’ and ‘y’ are often written with a very similar appearance and are thus easily confused). Human readers rely on contextual knowledge to correctly decode handwritten text, and as a result a large amount of research has been directed at applying syntactic and linguistic constraints to handwritten text recognition (see for example: H. Beigi and T. Fujisaki, “A Character Level Predictive Language Model and Its Application to Handwriting Recognition”, Proceedings of the Canadian Conference on Electrical and Computer Engineering, Toronto, Canada, Sep. 13-16, 1992; U. Marti and H. Bunke, “Handwritten Sentence Recognition”, Proceedings of the 15th International Conference on Pattern Recognition, Barcelona, Spain, Volume 3, pp. 467-470, 2000; D. Bouchaffra, V. Govindaraju, and S. Srihari, “Postprocessing of Recognized Strings Using Nonstationary Markovian Models”, IEEE Transactions Pattern Analysis and Machine Intelligence, 21(10), pp. 990-999, October 1999; J. Pitrelli and E. Ratzlaff, “Quantifying the Contribution of Language Modeling to Writer-Independent On-line Handwriting Recognition”, Proceedings of the Seventh International Workshop on Frontiers in Handwriting Recognition, Amsterdam, Sep. 11-13, 2000; R. Srihari, “Use of Lexical and Syntactic Techniques in Recognizing Handwritten Text”, ARPA Workshop on Human Language Technology, Princeton, N.J., March 1994; and L. Yaeger, B. Webb, and R. Lyon, “Combining Neural Networks and Context-Driven Search for On-Line, Printed Handwriting Recognition in the Newton”, AI Magazine, Volume 19, No. 1, pp. 73-89, AAAI 1998).

The increasing use of pen-based computing and the emergence of paper-based interfaces to networked computing resources (see for example: Anoto, “Anoto, Ericsson, and Time Manager Take Pen and Paper into the Digital Age with the Anoto Technology”, Press Release, Apr. 6, 2000; and Y. Chans, Z. Lei, D. Lopresti, and S. Kung, “A Feature Based Approach For Image Retrieval by Sketch”, Proceedings of SPIE Volume 3229: Multimedia Storage and Archiving Systems II, 1997) has highlighted the need for techniques to interpret digital ink. Pen-based computing allows users to interact with applications.

As a result of the progress in pen-based interface research, handwritten digital ink documents, represented by time-ordered sequences of sampled pen strokes, are becoming increasingly popular (J. Subrahmonia and T. Zimmerman: Pen Computing: Challenges and Applications. Proceedings of the ICPR, 2000, pp. 2060-2066). Handwriting typically involves writing in a mixture of writing styles (e.g. cursive, discrete, run-on etc.), a variety of fonts and scripts and different layouts (e.g. mixing drawings with text, various text line orientations etc.).

Presently, handwriting recognition accuracy remains relatively low, and the number of errors introduced by recognition (both for the database entries and for the handwritten query) means that present techniques do not work well. The process of converting handwriting into text results in the loss of a significant amount of information regarding the general shape and dynamic properties of the ink. In many handwriting styles (particularly cursive writing), the identification of individual characters is highly ambiguous.

Similar work has been performed in the fields of speech recognition, natural language processing, and machine translation.

Some known natural language recognition systems currently exist. Paragraph, Inc. offers a network-based distributed handwriting recognition system called “NetCalif” (ParaGraph, Handwriting Recognition for Internet Connected Device, November 1999) that is based on their Calligraphy handwriting recognition software. The user's natural handwriting (cursive, print, or a combination of both) is captured by client software, then transmitted from an Internet-connected device to the NetCalif servers where it is converted and returned as typewritten text to the client device.

Philips has developed “SpeechMagic”, a client/server-based, professional speech recognition software package (Philips, SpeechMagic 4.0, 2000). This system supports specialized vocabularies (called ConTexts), and dictation, recognition, and correction can be performed independently of location across a LAN, WAN, or the Internet.

In a networked information or data communications system, a user has access to one or more terminals which are capable of requesting and/or receiving information or data from local or remote information sources. The information source, in the present context, may be a database associated with an application. In such a communications system, a terminal may be a type of processing system, computer or computerised device, personal computer (PC), mobile, cellular or satellite telephone, mobile data terminal, portable computer, Personal Digital Assistant (PDA), pager, thin client, or any other similar type of digital electronic device. The capability of such a terminal to request and/or receive information or data can be provided by software, hardware and/or firmware. A terminal may include or be associated with other devices, for example a pen-based input device for handwriting input or a microphone for speech input.

An information source can include a server, or any type of terminal, that may be associated with one or more storage devices that are able to store information or data, such as digital ink, for example in one or more databases residing on a storage device. The exchange of information (i.e., the request and/or receipt of information or data) between a terminal and an information source, or other terminal(s), is facilitated by a communication means. The communication means can be realised by physical cables, for example a metallic cable such as a telephone line, semi-conducting cables, electromagnetic signals, for example radio-frequency signals or infra-red signals, optical fibre cables, satellite links or any other such medium or combination thereof connected to a network infrastructure.

The reference to any prior art in this specification is not, and should not be taken as, an acknowledgment or any form of suggestion that such prior art forms part of the common general knowledge.

DISCLOSURE OF INVENTION

The present invention seeks to provide improved natural language recognition, performed in a distributed system. This broadly includes a method of forwarding intermediate format data, generated by a recognizer module, to an application for context processing (i.e. decoding).

In another form, the present invention also seeks to provide means for managing multiple recognizers, user-specific dictionaries, and user-specific training of recognizers, which are desirable to make pattern recognition systems more accurate and flexible.

According to a first broad form of the invention, there is provided a method of providing computer-based recognition of natural language data, comprising the steps of: generating natural language data; and, transmitting the natural language data to a server; wherein, the server is programmed and configured to process the natural language data using a recognizer to produce intermediate format data, and is further capable of transmitting the intermediate format data to an application, and further wherein, the intermediate format data is decoded into computer-readable format data using context information.

According to a second broad form of the invention, there is provided a method for computer-based recognition of natural language data, comprising the steps of: receiving natural language data at a server from a remote input device; processing the natural language data using a recognizer residing on the server to produce intermediate format data; and, transmitting the intermediate format data to an application; wherein, the application is programmed and configured to decode the intermediate format data into computer-readable format data using context information associated with the application.

According to a third broad form of the invention, there is provided a method of providing computer-based recognition of natural language data for interaction with an application, wherein natural language data is received at a server from a remote input device, and the server processes the natural language data using a recognizer residing on the server to produce intermediate format data, the method comprising: the application receiving the intermediate format data from the server; and, the application decoding the intermediate format data into computer-readable format data using context information associated with the application.

According to specific, but non-limiting, embodiments of the invention, the natural language data is digital ink or speech; the digital ink is of a type from the group of: handwriting, textual, numerical, alphanumeric, pictorial or graphical; and/or processing the natural language data includes one or more of: normalizing the data; segmenting the data; and classifying the data.

According to further specific, but non-limiting, embodiments of the invention, the recognizer is implemented using software or hardware; the intermediate format data is a Directed Acyclic Graph (DAG) data structure; the DAG data structure is a matrix containing the processing results of segments of the natural language data; the intermediate format data includes segmented time-series classifier data; the natural language data is derived from protein sequencing, image processing, computer vision or econometrics; the application is remote to both the input device and the server; the application resides on the server; there is more than one recognizer, each recognizer controlled by a recognition management module; the application queries the recognition management module to identify a suitable recognizer to perform the processing; the context information is a user dictionary; the recognizer is able to be trained for a specific user; the input device is associated with a paper-based interface provided with coded markings; the coded markings are a pattern of infrared markings; the input device is an optically imaging pen; and/or each paper-based interface is uniquely identified and stored on a network server.
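To make the DAG form of intermediate format data concrete, the following Python sketch encodes the “clog”/“dog” ambiguity of FIG. 5 as a small graph and decodes it with and without a dictionary. The adjacency-dictionary encoding, the edge scores and the function names are assumptions made for illustration; the specification itself describes the DAG as a matrix of segment results and does not prescribe this layout.

```python
def paths(edges, node, goal):
    """Yield (text, score) for every path from node to goal in the DAG."""
    if node == goal:
        yield "", 1.0
        return
    for next_node, symbol, score in edges.get(node, []):
        for text, rest in paths(edges, next_node, goal):
            yield symbol + text, score * rest


def decode(edges, start, goal, dictionary=None):
    """Return the best-scoring path text, optionally restricted to a dictionary."""
    candidates = [
        (score, text)
        for text, score in paths(edges, start, goal)
        if dictionary is None or text in dictionary
    ]
    return max(candidates)[1] if candidates else None


# Nodes are candidate segmentation points; each edge is (next node, symbol, score).
# The first strokes may be read as "c" followed by "l", or as a single "d".
EDGES = {
    0: [(1, "c", 0.5), (2, "d", 0.4)],
    1: [(2, "l", 0.5)],
    2: [(3, "o", 0.9)],
    3: [(4, "g", 0.9)],
}

without_context = decode(EDGES, 0, 4)
with_context = decode(EDGES, 0, 4, dictionary={"clog", "shoe"})
```

With these invented scores the highest-scoring path on its own spells "dog", while an application whose dictionary contains "clog" but not "dog" decodes the same graph as "clog". Because the graph retains both segmentations, the choice can be deferred to whichever party holds the context.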

According to a specific embodiment of the invention, there is provided a method of recognising digital ink input by a user into a computer-based digital ink recognition system, the user interacting with a paper-based document, the paper-based document having disposed therein or thereon coded data indicative of a particular field of the paper-based document and of at least one reference point of the paper-based document, the method including the steps of:

receiving in a server, indicating data from a sensing device, operated by the user, regarding the identity of the paper-based document and at least one of a position and a movement of the sensing device relative to the paper-based document;

processing the indicating data using a recognizer residing on the server to produce intermediate format data; and,

transmitting the intermediate format data to an application;

wherein, the application decodes the intermediate format data into computer-readable format data using context information associated with the paper-based document;

further wherein, the sensing device comprises:

(a) an image sensor adapted to capture images of at least some of the coded data when the sensing device is placed in an operative position relative to the paper-based document; and

(b) a processor adapted to:

    (i) identify at least some of the coded data from one or more of the captured images;

    (ii) decode at least some of the coded data; and

    (iii) generate the indicating data using at least some of the decoded coded data.

In a particular form of the invention, the particular field of the paper-based document is associated with at least one zone of the paper-based document, and the method includes identifying the context information from the at least one zone.

According to a fourth broad form of the invention, there is provided a system for computer-based recognition of natural language data, the system implemented on a network and comprising: a server to receive natural language data generated by an input device via the network; and, a recognizer residing on the server to process the natural language data to produce intermediate format data; wherein, an application receives the intermediate format data and decodes the intermediate format data into computer-readable format data using context information associated with the application.

In further particular forms of the invention, the input device is a pen-based input device; the input device includes a microphone; the context information is derived from one or more of a document label, a document setting, a document field label or a document field attribute; the intermediate format data is transmitted to more than one application; and/or the application initiates the processing of the natural language data.

According to a further aspect of the present invention there is provided a method for computer-based recognition of natural language data, the method implemented on a network and comprising the steps of:

obtaining natural language data using an input device;

receiving the natural language data on a server via the network;

processing the natural language data using a recognizer residing on the server to produce intermediate format data;

transmitting the intermediate format data to an application; and,

decoding the intermediate format data into computer-readable format data using context information associated with the application.
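The division of labour in the steps above can be sketched as follows in Python. The `Server` and `Application` classes, the JSON payload layout and the canned recognizer output are hypothetical stand-ins; the sketch only illustrates that the server produces context-free intermediate data and that each application applies its own context when decoding. The example scores mirror the “tile”/“lite” ambiguity of FIG. 6.

```python
import json


class Server:
    """Hosts a recognizer and holds no application context."""

    def __init__(self, recognizer):
        self.recognizer = recognizer

    def process(self, natural_language_data):
        # Produce intermediate format data and serialize it for transmission.
        positions = self.recognizer(natural_language_data)
        return json.dumps({"positions": positions})


class Application:
    """Decodes intermediate format data using its own context information."""

    def __init__(self, context_words):
        self.context_words = context_words

    def decode(self, payload):
        positions = json.loads(payload)["positions"]
        best_word, best_score = None, 0.0
        for word in self.context_words:
            if len(word) != len(positions):
                continue
            score = 1.0
            for symbol, candidates in zip(word, positions):
                score *= candidates.get(symbol, 0.0)
            if score > best_score:
                best_word, best_score = word, score
        return best_word


def toy_recognizer(_ink):
    """Canned output standing in for a real recognizer: 't' and 'l' are confusable."""
    return [{"t": 0.5, "l": 0.5}, {"i": 1.0}, {"l": 0.5, "t": 0.5}, {"e": 1.0}]


server = Server(toy_recognizer)
payload = server.process("captured digital ink would go here")

roofing_app = Application(["tile", "roof"])
brewery_app = Application(["lite", "beer"])
roofing_result = roofing_app.decode(payload)
brewery_result = brewery_app.decode(payload)
```

The same payload decodes to "tile" in one application and "lite" in the other, so the server never needs access to either application's vocabulary.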

According to a further aspect of the present invention there is provided a method of recognising digital ink input by a user into a computer-based digital ink recognition system, the method including the steps of:

providing a user with a paper-based document, the paper-based document having disposed therein or thereon coded data indicative of a particular field of the paper-based document and of at least one reference point of the paper-based document;

receiving in a server, indicating data from a sensing device, operated by the user, regarding the identity of the paper-based document and at least one of a position and a movement of the sensing device relative to the paper-based document;

processing the indicating data using a recognizer residing on the server to produce intermediate format data;

transmitting the intermediate format data to an application;

decoding the intermediate format data into computer-readable format data using context information associated with the paper-based document;

wherein the sensing device comprises:

(a) an image sensor adapted to capture images of at least some of the coded data when the sensing device is placed in an operative position relative to the paper-based document; and

(b) a processor adapted to:

    (i) identify at least some of the coded data from one or more of the captured images;

    (ii) decode at least some of the coded data; and

    (iii) generate the indicating data using at least some of the decoded coded data.

According to a further aspect of the present invention there is provided a system for computer-based recognition of natural language data, the system implemented on a network and comprising:

an input device to generate natural language data;

a server to receive the natural language data via the network;

a recognizer residing on the server to process the natural language data to produce intermediate format data; and,

an application to receive the intermediate format data and to decode the intermediate format data into computer-readable format data using context information associated with the application.

BRIEF DESCRIPTION OF FIGURES

The present invention should become apparent from the following description, which is given by way of example only, of a preferred but non-limiting embodiment thereof, described in connection with the accompanying figures.

FIG. 1 (prior art) illustrates a distributed system architecture;

FIG. 2 (prior art) illustrates a flow chart of basic pattern recognition steps;

FIG. 3 illustrates an example processing system able to be used as a server to house a recognizer, according to a particular embodiment of the present invention;

FIG. 4 illustrates an example distributed recognition system, according to a particular embodiment of the present invention;

FIG. 5 illustrates an example of ambiguous handwriting input for “clog”/“dog”;

FIG. 6 illustrates an example of ambiguous handwriting input for “tile”/“lite”;

FIG. 7 illustrates an example recognition scenario, according to a particular embodiment of the present invention;

FIG. 8 illustrates an example recognizer selection scenario, according to a particular embodiment of the present invention;

FIG. 9 illustrates an example recognizer training scenario, according to a particular embodiment of the present invention;

FIG. 10 illustrates an example recognizer registration scenario, according to a particular embodiment of the present invention.

MODES FOR CARRYING OUT THE INVENTION

The following modes, given by way of example only, are described in order to provide a more precise understanding of the subject matter of the present invention.

A particular embodiment of the present invention can be realised using a processing system, an example of which is shown in FIG. 3. In particular, the processing system 300 generally includes at least one processor 302, or processing unit or plurality of processors, memory 304 and at least one output device 308, coupled together via a bus or group of buses 310. At least one storage device 314 which houses at least one database 316 can also be provided, which may be remote and accessed via a network. The memory 304 can be any form of memory device, for example, volatile or non-volatile memory, solid state storage devices, magnetic devices, etc. The processor 302 could include more than one distinct processing device, for example to handle different functions within the processing system 300.

Input device 306, for example a pen-based input device or a microphone, is normally remote to the system 300. Input device 306 is used by a user to generate natural language data 318 which is preferably transmitted over network 307 to system 300 for processing. Output device 308 produces or generates intermediate format data 320, for example for transmission over a network, which is sent to application 324; the application 324 could be remote or local to the system 300. The storage device 314 can be any form of data or information storage means, for example, volatile or non-volatile memory, solid state storage devices, magnetic devices, etc.

In use, the processing system 300 may be a server and is adapted to allow data or information to be stored in and/or retrieved from, via wired or wireless communication means, the at least one database 316, which may be remote and accessed via a further network. The processor 302 receives natural language data 318 from input device 306, preferably via network 307, and outputs intermediate format data 320 by utilising output device 308, for example a network interface. The application 324 may return decoded data to the processing system. The application 324 may cause information to be printed, for example on a Netpage™ printer, at a user's location. More than one input device 306 can be provided. It should be appreciated that the processing system 300 may be any form of terminal, server, specialised hardware, or the like. The processing system 300 may be a part of a networked communications system. Also, the application 324 may initiate transfer of natural language data 318 from the input device 306 to server 300.

In a particular embodiment, the server 300 is part of a system for computer-based recognition of natural language data, the system implemented on a network and comprising: the input device 306 to obtain natural language data; server 300 to receive the natural language data 318 via a network 307; a recognizer residing on the server 300 to process, in processor 302, the natural language data 318 to produce intermediate format data 320; and, an application 324 to receive the intermediate format data 320 and to decode the intermediate format data 320 into computer-readable format data using context information associated with the application 324.

The following example provides a more detailed discussion of a particular embodiment of the present invention. The example is intended to be merely illustrative and not limiting to the scope of the present invention.

In a particular preferred embodiment, the present invention is configured to work with the Netpage™ networked computer system, a detailed description of which is given in the applicant's co-pending applications, including in particular, PCT Publication No. WO0242989 entitled “Sensing Device” filed May 30, 2002, PCT Publication No. WO0242894 entitled “Interactive Printer” filed May 30, 2002, PCT Publication No. WO0214075 “Interface Surface Printer Using Invisible Ink” filed Feb. 21, 2002, PCT Publication No. WO0242950 “Apparatus For Interaction With A Network Computer System” filed May 30, 2002, and PCT Publication No. WO03034276 entitled “Digital Ink Database Searching Using Handwriting Feature Synthesis” filed Apr. 24, 2003.

It will be appreciated that not every implementation will necessarily embody all or even most of the specific details and extensions described in these applications in relation to the basic system. However, the system is described in its most complete form to assist in understanding the context in which the preferred embodiments and aspects of the present invention operate.

In brief summary, the preferred form of the Netpage system provides an interactive paper-based interface to online information by utilizing pages of invisibly coded paper and an optically imaging pen. Each page generated by the Netpage system is uniquely identified and stored on a network server, and all user interaction with the paper using the Netpage pen is captured, interpreted, and stored. Digital printing technology facilitates the on-demand printing of Netpage documents, allowing interactive applications to be developed. The Netpage printer, pen, and network infrastructure provide a paper-based alternative to traditional screen-based applications and online publishing services, and support user-interface functionality such as hypertext navigation and form input.

Typically, a printer receives a document from a publisher or application provider via a broadband connection and prints it with an invisible pattern of infrared tags, each of which encodes the location of the tag on the page and a unique page identifier. As a user writes on the page, the imaging pen decodes these tags and converts the motion of the pen into digital ink. The digital ink is transmitted over a wireless channel to a relay base station, and then sent to the network for processing and storage. The system uses a stored description of the page to interpret the digital ink, and performs the requested actions by interacting with an application.

Applications provide content to the user by publishing documents, and process the digital ink interactions submitted by the user. Typically, an application generates one or more interactive pages in response to user input, which are transmitted to the network to be stored, rendered, and finally printed as output to the user. The Netpage system allows sophisticated applications to be developed by providing services for document publishing, rendering, and delivery, authenticated transactions and secure payments, handwriting recognition and digital ink searching, and user validation using biometric techniques such as signature verification.

Distributed Pattern Recognition

An example architecture for a distributed pattern recognition system 400 is depicted in FIG. 4. In the example, a signal 410 is recorded by an input device 415 at a client layer 420 and transmitted over a network to a server (network layer 430) for recognition by a recognizer 440, with the intermediate results 445 transmitted back to the client layer 420 or a third party application 450 on an application layer 455 for interpretation and processing. One advantage of this approach is that client devices 415 and distributed applications 450 do not require the significant computing resources commonly needed to perform natural language pattern recognition, and the network servers that perform the recognition are not subject to the resource constraints that are inherent in many client devices 415 (e.g. mobile phones, personal digital assistants, imaging pens, etc.). As a result, network servers are able to use extremely processor- and/or memory-intensive techniques to improve recognition accuracy, and can use hardware optimised to perform the specific recognition task.

Performing pattern recognition on a centralized server (e.g. processing system 300) also offers an advantage to pattern-recognition systems that employ user-specific adaptation to achieve higher recognition rates. For example, some handwriting recognition techniques develop a handwriting model for each user of the system based on previous recognition results, which is then used to improve the future accuracy of the system for that user (see for example L. Schomaker, H. Teulings, E. Helsper, and G. Abbink, “Adaptive Recognition Of Online, Cursive Handwriting”, Proceedings of the Sixth International Conference on Handwriting and Drawing. Paris, July, 4-7 Telecom, (pp. 19-21), 1993 and S. Connell and A. K. Jain, “Writer Adaptation of Online Handwritten Models,” Proc. 5th International Conference on Document Analysis and Recognition, Bangalore, India, pp. 434-437, September 1999).

This adaptation is more effective if a single server, or set of servers, performs all recognition for a user (rather than a large number of individual applications each performing their own recognition), since the server is able to perform adaptation based on the input generated by all applications. In addition to this, centralized server-based pattern recognition simplifies the management of the recognition system 400 by allowing recognizers to be reconfigured and upgraded without interaction with the distributed client devices 415 and applications 450, and allows training and test data to be easily collected.

However, the information required to perform the context processing stage of a pattern recognition system is generally application specific and is often very large (e.g. entries in a large application-specific database), making it impractical to transmit the context information to a centralized server for processing. A solution to this problem is to use a mechanism for distributed recognition as depicted in FIG. 4. When a user generates a signal (i.e. natural language data) 410 to be recognized and processed by an application, the signal 410 is submitted to a distributed server for processing. The server performs processing steps such as pre-processing, segmentation, and classification (see FIG. 2), but does not use a context model to decode the result (or only performs partial decoding as described in the following discussion). Rather, the intermediate recognition results (i.e. intermediate format data) are returned or sent to the application, allowing the application to apply any arbitrarily complex and domain-specific context processing to decode the signal.

Symbol DAG

One method of returning the intermediate recognition results (i.e. intermediate format data) to an application is to use a symbol DAG (Directed Acyclic Graph), which is a generic data structure that contains symbols and their associated scores as vertices, and valid transitions between symbols as edges. The structure can be implemented as a two-dimensional array of elements, each of which defines the output generated by the pattern classifier for a single segment of the signal and the associated valid transitions for that segment. This structure represents all the potential recognition alternatives that may be derived from the input signal based on the results of the classifier. The application uses this structure, in combination with a context model, to decode the input signal.

The symbol DAG is equivalent to a matrix where each column contains the results of the classification of a single segment of the input signal. Each element in the column represents the probability that the classified segment is a particular symbol, and includes an offset that indicates the next possible segment (column) in the input signal that can follow this symbol. Thus, the matrix represents all the possible decoding paths based on the output of the pattern classifier. These paths and associated classification scores can be combined with a context model to fully decode the input signal.
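The matrix form described above can be sketched as a list of columns, each holding the alternatives for one segment. This is only an illustrative layout under stated assumptions: the names `Entry`, `SymbolDAG` and `offsets_point_forward` are hypothetical, the end of the DAG is represented by `None`, and the two-segment input is made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    symbol: str              # symbol proposed by the classifier for this segment
    score: float             # classifier score, here a probability
    next_col: Optional[int]  # column that may follow this symbol; None ends the path


# One inner list per segment (column); each holds that segment's alternatives.
SymbolDAG = List[List[Entry]]


def offsets_point_forward(dag: SymbolDAG) -> bool:
    """Return True if every offset targets a later, existing column."""
    for col, entries in enumerate(dag):
        for entry in entries:
            nxt = entry.next_col
            if nxt is not None and not (col < nxt < len(dag)):
                return False
    return True


# Made-up two-segment input: the first segment is ambiguous, the second is not.
toy_dag: SymbolDAG = [
    [Entry("a", 0.9, 1), Entry("o", 0.1, 1)],
    [Entry("t", 1.0, None)],
]
print(offsets_point_forward(toy_dag))
```

Requiring every offset to point to a later column is one simple way to guarantee that the structure stays acyclic, so that any traversal terminates.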

Note that the symbol DAG is applicable in any pattern recognition task where a sequence of classification results is decoded using a context or set of constraints. The symbols contained in the symbol DAG may be any primitive element that is generated as the output of a pattern classifier, including the output from a time-series classifier. Examples of such recognition systems include handwriting and speech recognition, protein sequencing (see A. C. Camproux, P. Tuffery, S. Hazout, “Hidden Markov Model Approach For Identifying The Modular Framework Of The Protein Backbone”, Protein Engineering, 12(12), pp. 1063, December 1999), image processing and computer vision (see Y. He, A. Kundu, “2-D Shape Classification Using Hidden Markov Model”, IEEE Transactions on Pattern Analysis, 13(11), November 1991), and econometrics (see T. Ryden, T. Terasvirta, S. Asbrink, “Stylized Facts of Daily Return Series and the Hidden Markov Model”, Journal of Applied Econometrics, 13(3), pp. 217, May 1998).

Symbol DAG Example

As an example, Table 1 shows a symbol DAG that represents the output from a handwritten character recognizer generated by the ambiguous text given in FIG. 5. In this example, the recognizer has found two possible character segmentation arrangements, as depicted by the two rows in the symbol DAG. Note that in the examples, the symbol scores are given as probabilities; however, an actual implementation may typically use log-probabilities (i.e. the base-10 logarithm of the probability result) to improve the performance of context processing and to avoid overflow and underflow problems that occur when multiplying probabilities using finite precision floating-point operations.
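The numerical point about log-probabilities can be illustrated with a short sketch. The function names are hypothetical, and the long path of 100 scores of 1e-5 is made up purely to force an underflow in double-precision arithmetic.

```python
import math
from typing import Iterable


def path_probability(scores: Iterable[float]) -> float:
    """Multiply raw probabilities along one decoding path."""
    return math.prod(scores)


def path_log10_score(scores: Iterable[float]) -> float:
    """Add base-10 log-probabilities along one decoding path."""
    return sum(math.log10(p) for p in scores)


# Short path: both forms carry the same information.
short = [0.7, 0.8, 1.0, 1.0]
print(path_probability(short), 10 ** path_log10_score(short))

# Long path of small (made-up) scores: the product underflows to 0.0,
# while the log-domain total remains an ordinary finite number.
long_path = [1e-5] * 100
print(path_probability(long_path), path_log10_score(long_path))
```

Once the product has underflowed to zero, all long paths compare as equal, whereas the summed logarithms still rank them correctly.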

To decode the alternatives, the context processor starts with the first entry in the DAG (i.e. the character ‘c’). The score for this entry is added to the accumulated total (since log-probabilities are added rather than multiplied), and processing moves to the column given by the offset value in the entry (in this example, column 1). In column 1, two alternatives exist (i.e. “cl” or “cb”), and the scores for these alternatives are found by adding the scores to the previous total. The decoding continues until the end of the DAG is reached. Similarly, the second entry in column 0 (i.e. the character ‘d’) is decoded; note however, that column 1 is skipped in this traversal of the DAG, as indicated by the offset value of 2 in the character score entry. This is due to the letter ‘d’ being constructed using two strokes, and thus the recognition of the letters ‘l’ and ‘b’ cannot be valid in this alternative. Thus, the potential decoding alternatives in this example are:

clog=0.7*0.8*1.0*1.0=0.56
cbg=0.7*0.2*1.0=0.14
dog=0.3*1.0*1.0=0.30
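The traversal just described can be sketched as a small recursive walk over the DAG of Table 1. This is an illustration rather than the recognizer's actual implementation: the tuple layout and the name `enumerate_paths` are assumptions, the end of the DAG is represented by `None`, and scores are kept as plain probabilities so that the results match the figures above.

```python
from typing import Dict, List, Optional, Tuple

# (symbol, probability, next column); None marks the end of the DAG.
Entry = Tuple[str, float, Optional[int]]

TABLE_1: List[List[Entry]] = [
    [("c", 0.7, 1), ("d", 0.3, 2)],   # column 0: 'd' uses two strokes, so it skips column 1
    [("l", 0.8, 2), ("b", 0.2, 3)],   # column 1
    [("o", 1.0, 3)],                  # column 2
    [("g", 1.0, None)],               # column 3
]


def enumerate_paths(dag: List[List[Entry]], col: int = 0,
                    prefix: str = "", score: float = 1.0) -> Dict[str, float]:
    """Follow every offset chain from `col` and return {text: path probability}."""
    results: Dict[str, float] = {}
    for symbol, prob, nxt in dag[col]:
        text, total = prefix + symbol, score * prob
        if nxt is None:
            results[text] = results.get(text, 0.0) + total
        else:
            for word, value in enumerate_paths(dag, nxt, text, total).items():
                results[word] = results.get(word, 0.0) + value
    return results


for word, value in sorted(enumerate_paths(TABLE_1).items()):
    print(f"{word}: {value:.2f}")
# cbg: 0.14
# clog: 0.56
# dog: 0.30
```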

These values can now be combined with a language model or other contextual information to select the most likely word.

TABLE 1 Example DAG for “clog”/“dog” ambiguity

            0     1     2     3
Character   c     l     o     g
Offset      1     2     3     0
Score       0.7   0.8   1.0   1.0
Character   d     b
Offset      2     3
Score       0.3   0.2

The DAG structure must ensure that strokes are assigned to an individual letter only once. To do this, alternate paths must be defined to ensure that if a stroke is assigned to a letter, no subsequent letter may use that stroke in its construction. An example of this is given in FIG. 6, with the derived DAG depicted in Table 2. In this example, the short, horizontal marks can potentially be recognized as crossbar elements of a letter ‘t’, or diacritical marks for the letter ‘i’. However, if a marking is used as a crossbar, it cannot subsequently be used as a diacritical. The potential decoding alternatives in this example are:

tile=0.6*1.0*0.6*1.0=0.36
tite=0.6*1.0*0.4*1.0=0.24
lite=0.4*1.0*1.0*1.0=0.40

These values can now be combined with a language model to select the most likely word.

TABLE 2 Example DAG for “lite”/“tile” ambiguity

            0     1     2     3     4     5
Character   t     i     i     t     l     e
Offset      1     4     3     5     5     —
Score       0.6   1.0   1.0   1.0   0.6   1.0
Character   l                       t
Offset      2                       5
Score       0.4                     0.4
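The combination with a language model can be sketched as follows. Several things here are assumptions rather than part of the described system: the second-row entries of Table 2 are taken to sit in columns 0 and 4, the end of the DAG is represented by `None`, the names `decode_paths` and `best_word` are hypothetical, and `WORD_PRIOR` is a made-up unigram prior standing in for a real language model.

```python
from typing import Dict, List, Optional, Tuple

Entry = Tuple[str, float, Optional[int]]  # (symbol, probability, next column or None)

# Table 2, assuming the second-row 'l' sits in column 0 and the second-row 't' in column 4.
TABLE_2: List[List[Entry]] = [
    [("t", 0.6, 1), ("l", 0.4, 2)],
    [("i", 1.0, 4)],
    [("i", 1.0, 3)],
    [("t", 1.0, 5)],
    [("l", 0.6, 5), ("t", 0.4, 5)],
    [("e", 1.0, None)],
]

# Hypothetical unigram word prior; words it does not list get probability 0.
WORD_PRIOR: Dict[str, float] = {"tile": 0.5, "lite": 0.5}


def decode_paths(dag, col=0, prefix="", score=1.0):
    """Yield (text, classifier probability) for every path through the DAG."""
    for symbol, prob, nxt in dag[col]:
        if nxt is None:
            yield prefix + symbol, score * prob
        else:
            yield from decode_paths(dag, nxt, prefix + symbol, score * prob)


def best_word(dag, prior):
    """Weight each path by the prior and return the highest-scoring word."""
    scored = {text: p * prior.get(text, 0.0) for text, p in decode_paths(dag)}
    return max(scored, key=scored.get), scored


word, scored = best_word(TABLE_2, WORD_PRIOR)
print(word)
```

With this particular prior the non-word is eliminated and the two real words are separated only by their classifier scores; a different context model could of course reverse the outcome.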

Additionally, the character value of a DAG entry can be set to zero, indicating a NUL character (i.e. a character that does not change the text, but will modify the text probability). This allows word break positions (i.e. spaces) to be modeled as a SPACE/NUL pair, indicating that there is a certain probability that a space appears at that point in the DAG. For example:

TABLE 3 Example DAG for SPACE/NUL pair

            0     1       2
Character   a     NUL     b
Offset      1     2       —
Score       1.0   0.6     1.0
Character         SPACE
Offset            2
Score             0.4

The potential decoding alternatives in this example are:

ab=1.0*0.6*1.0=0.6
a b=1.0*0.4*1.0=0.4
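The SPACE/NUL pair can be handled like any other pair of alternatives. The sketch below finds the single most probable decoding in the log domain by working from the last column back to the first, which is possible because offsets only point forward. It rests on assumptions: offsets are read as absolute column indices, so both entries in column 1 lead to column 2; NUL is represented by an empty string; the end of the DAG is `None`; and the name `best_decoding` is hypothetical.

```python
import math
from typing import List, Optional, Tuple

NUL, SPACE = "", " "  # NUL adds no text; SPACE adds a word break

Entry = Tuple[str, float, Optional[int]]  # (symbol, probability, next column or None)

# Table 3, with offsets read as absolute column indices (assumption).
TABLE_3: List[List[Entry]] = [
    [("a", 1.0, 1)],
    [(NUL, 0.6, 2), (SPACE, 0.4, 2)],
    [("b", 1.0, None)],
]


def best_decoding(dag: List[List[Entry]]) -> Tuple[str, float]:
    """Return the most probable text and its log10 score, working right to left."""
    best: List[Optional[Tuple[float, str]]] = [None] * len(dag)
    for col in range(len(dag) - 1, -1, -1):
        for symbol, prob, nxt in dag[col]:
            if prob <= 0.0:
                continue  # log10 is undefined for a zero score
            tail = (0.0, "") if nxt is None else best[nxt]
            if tail is None:
                continue  # the successor column cannot reach the end
            candidate = (math.log10(prob) + tail[0], symbol + tail[1])
            if best[col] is None or candidate[0] > best[col][0]:
                best[col] = candidate
    if best[0] is None:
        raise ValueError("no complete path through the DAG")
    score, text = best[0]
    return text, score


text, log_score = best_decoding(TABLE_3)
print(repr(text), round(10 ** log_score, 2))
```

Without a context model the NUL alternative wins here simply because its score is higher; an application that knows "a b" is more plausible than "ab" would rescore the two paths accordingly.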

Distributed Recognizer Management

Referring to FIGS. 7 and 8, a distributed recognition system 700 may support a number of different recognizers 440 that are controlled by a distributed recognition management system or recognition manager 710. These recognizers 440 can include systems capable of supporting different classes of recognition, such as different languages, dialects, or accents, or cursive or boxed input for handwriting systems. When an application 450 requires a recognition task to be performed, the application 450 first queries 720 the recognition manager 710 to find a recognizer 440 that matches the parameters of the input to be recognized (as depicted in FIG. 8). The recognition manager 710 then queries 730 each recognizer 440 to find a recognizer that supports the parameters specified by the application 450. When a recognizer 440 indicates support 740 (as opposed to no support 750 from recognizer 440a in FIG. 8) for the specified parameter set, the enumeration ends and the selected recognizer 440 (in the case of FIG. 8, recognizer 440b) is passed 760 to the application 450. Note that the individual recognizers 440 do not need to be centralized and may be distributed throughout the system 700, since the recognition manager 710 acts as a controller for the set of recognizers 440. The application 450 can then request processing by the selected recognizer by passing or directing 770 the signal and parameters to the selected recognizer 440. Intermediate format data 445, i.e. a symbol lattice, is returned to the application 450 and the application 450 can return a response 780 to the input device 415.
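The query sequence of FIG. 8 can be sketched as a first-match enumeration. The class names, the `capabilities` dictionary and the parameter keys ("language", "style") are hypothetical; the description does not specify how parameters are encoded or how recognizers are addressed over the network.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Recognizer:
    name: str
    capabilities: Dict[str, Set[str]]  # parameter name -> supported values

    def supports(self, params: Dict[str, str]) -> bool:
        """Answer the manager's query (730) with support (740) or no support (750)."""
        return all(value in self.capabilities.get(key, set())
                   for key, value in params.items())


@dataclass
class RecognitionManager:
    recognizers: List[Recognizer] = field(default_factory=list)

    def find_recognizer(self, params: Dict[str, str]) -> Optional[Recognizer]:
        """Handle an application query (720): stop at the first supporting recognizer."""
        for recognizer in self.recognizers:
            if recognizer.supports(params):
                return recognizer  # passed (760) to the application
        return None


manager = RecognitionManager([
    Recognizer("440a", {"language": {"en"}, "style": {"boxed"}}),
    Recognizer("440b", {"language": {"en"}, "style": {"cursive"}}),
])
selected = manager.find_recognizer({"language": "en", "style": "cursive"})
print(selected.name if selected else "no recognizer available")
```

The application would then send the signal and parameters directly to the returned recognizer, so the manager stays out of the data path.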

User-Specific Dictionaries

Distributed recognition systems can also support user dictionaries, which are user-specific word lists (and possibly associated a priori probabilities) that include words that a user writes frequently but which are unlikely to appear in a standard dictionary (examples include company names, work or personal interest specific terms, etc.). User dictionaries can be stored and managed centrally so that words added to the dictionary when using one application are available to all applications for context processing. Applications can also manage and use their own local user-specific dictionaries if required, since they have full control over context decoding.

When an application requires the recognition of a signal that may contain words found in the user dictionary (e.g. standard handwritten text input such as the subject line of an e-mail or an arbitrary voice message), the centralized recognition system generates the usual intermediate recognition results to be returned to the application for context decoding. However, in addition to this it decodes the intermediate results using the user dictionary as a language model, the result of which is also returned to the application. These two intermediate results structures can be combined by the application during its context decoding to generate a final decoding that includes the user-specific dictionary information.
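One way to picture the two result structures is sketched below, reusing the DAG of Table 1. Everything beyond that table is an assumption: the user dictionary entries ("cbg" as a user-specific abbreviation, "netpage"), the function names, the application-side scores, and in particular the merge policy in `combine`, since the description leaves the combination method to the application.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Entry = Tuple[str, float, Optional[int]]  # (symbol, probability, next column or None)

# Symbol DAG of Table 1; None stands in for the end-of-DAG marker.
TABLE_1: List[List[Entry]] = [
    [("c", 0.7, 1), ("d", 0.3, 2)],
    [("l", 0.8, 2), ("b", 0.2, 3)],
    [("o", 1.0, 3)],
    [("g", 1.0, None)],
]


def decode_with_dictionary(dag: List[List[Entry]],
                           words: Iterable[str]) -> Dict[str, float]:
    """Keep only the DAG paths that spell a word in the user dictionary."""
    vocabulary = set(words)
    prefixes = {w[:i] for w in vocabulary for i in range(len(w) + 1)}
    results: Dict[str, float] = {}

    def walk(col: int, text: str, score: float) -> None:
        for symbol, prob, nxt in dag[col]:
            candidate = text + symbol
            if candidate not in prefixes:
                continue  # no dictionary word starts this way: prune the branch
            if nxt is None:
                if candidate in vocabulary:
                    results[candidate] = results.get(candidate, 0.0) + score * prob
            else:
                walk(nxt, candidate, score * prob)

    walk(0, "", 1.0)
    return results


def combine(application_scores: Dict[str, float],
            user_scores: Dict[str, float]) -> Dict[str, float]:
    """One possible merge policy: keep the higher score for each candidate."""
    merged = dict(application_scores)
    for word, score in user_scores.items():
        merged[word] = max(merged.get(word, 0.0), score)
    return merged


user_scores = decode_with_dictionary(TABLE_1, {"cbg", "netpage"})
print(combine({"clog": 0.56, "dog": 0.30}, user_scores))
```

Pruning on dictionary prefixes keeps the server-side decoding cheap even when the DAG encodes many alternatives, because branches that cannot end in a user word are abandoned early.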

User-Specific Training

Distributed recognition systems may also support user-specific training for a recognizer 440, as depicted in FIG. 9. The data generated by a user-specific recognition training application is submitted 910 to the centralized recognition manager 710, which stores 920 the data in a database 930. The recognition manager 710 then enumerates all recognizers 440 to determine if they support the data format as defined by the parameters associated with the training data, and if so (True signal 940), submits the training data 950 to the recognizer 440 for user-specific training.

When an existing recognizer is upgraded or a new recognizer is added to the system, the recognition manager 710 queries 1010 the training database 930 to determine if any training data 1020 of the format required by the recognizer 440 exists. If so, the training data 1020 is submitted to the newly registered recognizer 440 for processing, as depicted in FIG. 10.
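The two training flows of FIGS. 9 and 10 can be sketched together: new training data is stored and fanned out to matching recognizers, and a newly registered recognizer is caught up from the stored data. The class names, the single `data_format` string and the in-memory list used as the database are assumptions made for the illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class TrainingSample:
    user_id: str
    data_format: str  # hypothetical format label, e.g. "cursive-ink"
    payload: bytes


@dataclass
class Recognizer:
    name: str
    data_format: str
    trained_on: List[TrainingSample] = field(default_factory=list)

    def supports(self, sample: TrainingSample) -> bool:
        return sample.data_format == self.data_format

    def train(self, sample: TrainingSample) -> None:
        self.trained_on.append(sample)  # stand-in for real user-specific training


@dataclass
class RecognitionManager:
    recognizers: List[Recognizer] = field(default_factory=list)
    database: List[TrainingSample] = field(default_factory=list)

    def submit_training_data(self, sample: TrainingSample) -> None:
        self.database.append(sample)            # store (920) in the database (930)
        for recognizer in self.recognizers:     # enumerate all recognizers
            if recognizer.supports(sample):     # True signal (940)
                recognizer.train(sample)        # submit training data (950)

    def register(self, recognizer: Recognizer) -> None:
        self.recognizers.append(recognizer)
        for sample in self.database:            # query (1010) the training database
            if recognizer.supports(sample):
                recognizer.train(sample)        # replay stored training data (1020)


manager = RecognitionManager()
old = Recognizer("cursive-v1", "cursive-ink")
manager.register(old)
manager.submit_training_data(TrainingSample("user-1", "cursive-ink", b"strokes"))
new = Recognizer("cursive-v2", "cursive-ink")
manager.register(new)
print(len(old.trained_on), len(new.trained_on))
```

Keeping the raw training data in the manager's database, rather than only inside each recognizer, is what allows an upgraded recognizer to be retrained without asking the user to repeat the training session.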

The invention may also be said to broadly consist in the parts, elements and features referred to or indicated herein, individually or collectively, in any or all combinations of two or more of the parts, elements or features, and wherein specific integers are mentioned herein which have known equivalents in the art to which the invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.

Although a preferred embodiment has been described in detail, it should be understood that various changes, substitutions, and alterations can be made by one of ordinary skill in the art without departing from the scope of the present invention.

1. A method of providing computer-based recognition of natural language data, comprising the steps of: generating natural language data using an input device; and, transmitting the natural language data to a server via a network; wherein, the server is programmed and configured to process the natural language data using a recognizer residing on the server to produce intermediate format data, and is further programmed and configured to transmit the intermediate format data to an application, and further wherein, the intermediate format data is decoded into computer-readable format data using context information associated with the application.
2. A method for computer-based recognition of natural language data, the method implemented on a network and comprising the steps of: obtaining natural language data using an input device; receiving the natural language data on a server via the network; processing the natural language data using a recognizer residing on the server to produce intermediate format data; transmitting the intermediate format data to an application; and, decoding the intermediate format data into computer-readable format data using context information associated with the application.
3. The method as claimed in claim 1 or 2, wherein the natural language data is digital ink or speech.
4. The method as claimed in claim 1 or 2, wherein processing the natural language data includes one or more of: normalizing the data; segmenting the data; and classifying the data.
5. The method as claimed in claim 1 or 2, wherein the recognizer is implemented using software or hardware.
6. The method as claimed in claim 1 or 2, wherein the intermediate format data is a Directed Acyclic Graph (DAG) data structure.
7. The method as claimed in claim 6, wherein the DAG data structure is a matrix containing the processing results of segments of the natural language data.
8. The method as claimed in claim 1 or 2, wherein the intermediate format data includes segmented time-series classifier data.
9. The method as claimed in claim 1 or 2, wherein the natural language data is derived from protein sequencing, image processing, computer vision or econometrics.
10. The method as claimed in claim 1 or 2, wherein the application is remote to both the input device and the server.
11. The method as claimed in claim 1 or 2, wherein the application resides on the server.
12. The method as claimed in claim 1 or 2, wherein the context information is a user dictionary.
13. The method as claimed in claim 1 or 2, wherein the recognizer can be trained for a specific user.
14. The method as claimed in claim 1 or 2, wherein the input device is associated with a paper-based interface provided with coded markings.
15. The method as claimed in claim 14, wherein the coded markings are a pattern of infrared markings.
16. The method as claimed in claim 14, wherein the input device is an optically imaging pen.
17. The method as claimed in claim 14, wherein each paper-based interface is uniquely identified and stored on a network server.
18. A method for computer-based recognition of natural language data, comprising the steps of: receiving natural language data at a server from a remote input device; processing the natural language data using a recognizer residing on the server to produce intermediate format data; and, transmitting the intermediate format data to an application; wherein, the application is programmed and configured to decode the intermediate format data into computer-readable format data using context information associated with the application.
19. A method of providing computer-based recognition of natural language data for interaction with an application, wherein natural language data is received at a server from a remote input device, and the server processes the natural language data using a recognizer residing on the server to produce intermediate format data, the method comprising: the application receiving the intermediate format data from the server; and, the application decoding the intermediate format data into computer-readable format data using context information associated with the application.
20. A method of recognising digital ink input by a user into a computer-based digital ink recognition system, the user interacting with a paper-based document, the paper-based document having disposed therein or thereon coded data indicative of a particular field of the paper-based document and of at least one reference point of the paper-based document, the method including the steps of: receiving in a server, indicating data from a sensing device, operated by the user, regarding the identity of the paper-based document and at least one of a position and a movement of the sensing device relative to the paper-based document; processing the indicating data using a recognizer residing on the server to produce intermediate format data; and, transmitting the intermediate format data to an application; wherein, the application decodes the intermediate format data into computer-readable format data using context information associated with the paper-based document; further wherein, the sensing device comprises: (a) an image sensor adapted to capture images of at least some of the coded data when the sensing device is placed in an operative position relative to the paper-based document; and (b) a processor adapted to: (i) identify at least some of the coded data from one or more of the captured images; (ii) decode at least some of the coded data; and (iii) generate the indicating data using at least some of the decoded coded data.
21. A method of recognising digital ink input by a user into a computer-based digital ink recognition system, the method including the steps of: providing a user with a paper-based document, the paper-based document having disposed therein or thereon coded data indicative of a particular field of the paper-based document and of at least one reference point of the paper-based document; receiving in a server, indicating data from a sensing device, operated by the user, regarding the identity of the paper-based document and at least one of a position and a movement of the sensing device relative to the paper-based document; processing the indicating data using a recognizer residing on the server to produce intermediate format data; transmitting the intermediate format data to an application; decoding the intermediate format data into computer-readable format data using context information associated with the paper-based document; wherein the sensing device comprises: (a) an image sensor adapted to capture images of at least some of the coded data when the sensing device is placed in an operative position relative to the paper-based document; and (b) a processor adapted to: (i) identify at least some of the coded data from one or more of the captured images; (ii) decode at least some of the coded data; and (iii) generate the indicating data using at least some of the decoded coded data.
22. The method as claimed in claim 20 or 21, wherein the particular field of the paper-based document is associated with at least one zone of the paper-based document, and the method includes identifying the context information from the at least one zone.
23. A system for computer-based recognition of natural language data, the system implemented on a network and comprising: a server to receive natural language data generated by an input device via the network; and, a recognizer residing on the server to process the natural language data to produce intermediate format data; wherein, an application receives the intermediate format data and decodes the intermediate format data into computer-readable format data using context information associated with the application.
24. A system for computer-based recognition of natural language data, the system implemented on a network and comprising: an input device to generate natural language data; a server to receive the natural language data via the network; a recognizer residing on the server to process the natural language data to produce intermediate format data; and, an application to receive the intermediate format data and to decode the intermediate format data into computer-readable format data using context information associated with the application.
25. The system as claimed in claim 23 or 24, wherein the input device is a pen-based input device.
26. The system as claimed in claim 23 or 24, wherein the input device includes a microphone.
27. The system as claimed in claim 23 or 24, wherein the intermediate format data is transmitted to more than one application.
28. The system as claimed in claim 23 or 24, wherein the application initiates the processing of the natural language data.
29. The system as claimed in claim 23 or 24, including a recognizer manager to select a recognizer from a plurality of recognizers.